Design of tree-based context clustering for an HMM-based Thai speech synthesis system
نویسندگان
چکیده
This paper proposes an approach to improving the correctness of tone of the synthesized speech which is generated by an HMM-based Thai speech synthesis system. In the tree-based context clustering process, tone groups and tone types are used to design four different structures of decision tree including a single binary tree structure, a simple tone-separated tree structure, a constancy-based-tone-separated tree structure, and a trend-based-tone-separated tree structure. A subjective evaluation of tone correctness is conducted by using tone perception of eight Thai listeners. The simple tone-separated tree structure gives the highest level of tone correctness, while the single binary tree structure gives the lowest level of tone correctness. Moreover, the additional contextual tone information which is applied to all structures of the decision tree achieves a significant improvement of tone correctness. Finally, the evaluation of syllable duration distortion among the four structures shows that the constancy-based-toneseparated and the trend-based-tone-separated tree structures can alleviate the distortions that appear when using the simple tone-separated tree structure.
منابع مشابه
Tone Question of Tree Based Context Clustering for Hidden Markov Model Based Thai Speech Synthesis
Problem statement: In HMM-based Thai speech synthesis, tone is an important issue that brings about the intelligibility of the synthesized speech. Tone distortion resulted from imbalance of the training data should be appropriately treated. Approach: This study described an HMM-based speech synthesis system for Thai language. In the system, spectrum, pitch and state duration are modeled simulta...
متن کاملAnalysis of Decision Trees in Context Clustering of Hidden Markov Model Based Thai Speech Synthesis
Problem statement: In Thai speech synthesis using Hidden Markov model (HMM) based synthesis system, the tonal speech quality is degraded due to tone distortion. This major problem must be treated appropriately to preserve the tone characteristics of each syllable unit. Since tone brings about the intelligibility of the synthesized speech. It is needed to establish the tone questions and other p...
متن کاملImplementation and evaluation of an HMM-based Thai speech synthesis system
This paper describes a novel approach to the realization of Thai speech synthesis. Spectrum, pitch, and phone duration are modeled simultaneously in a unified framework of HMM, and their parameter distributions are clustered independently by using a decision-tree based context clustering technique with different styles. A group of contextual factors which affect spectrum, pitch, and state durat...
متن کاملImprovement of Tone Intelligibility for Average-Voice-Based Thai Speech Synthesis
Problem statement: Tone intelligibility in speech synthesis is an important attribute that should be taken into account. The tone correctness of the synthetic speech is degraded considerably in the average-voice-based HMM-based Thai speech synthesis. The tying mechanism in the decision tree based context clustering without appropriate criterion causes unexpected tone neutralization. Incorporati...
متن کاملTowards the Development of Speaker-Dependent and Speaker-Independent Hidden Markov Model-Based Thai Speech Synthesis
Problem statement: Tone distortion in Thai languages can deteriorate not only the intelligibility of speech but also its naturalness. Therefore, the correctness of tone must be carefully taken into account in continuous speech synthesis. The preliminary work confronted this problem when applying HMM-based speech synthesis to Thai. Approach: This study presented a study on speaker-dependent and ...
متن کامل